Identification of a biological form in the Anopheles stephensi laboratory colony using the odorant-binding protein 1 intron I sequence

Background
Anopheles stephensi Liston (1901) is a major vector of malaria in Asia and has recently been found in some regions of Africa. The An. stephensi species complex is suspected to comprise three sibling species: type, intermediate, and mysorensis, each with its own ecology and vector competence for the malaria parasite. To identify the members of the species complex in our An. stephensi insectary colony, we used the morphological features of eggs and genetic markers such as AnsteObp1 (Anopheles stephensi odorant-binding protein 1), mitochondrial cytochrome oxidase subunits 1 and 2 (COI and COII), and the nuclear internal transcribed spacer 2 locus (ITS2).

Methods
Eggs were collected from individual mosquitoes (n = 50) and the number of ridges was counted under a stereomicroscope. Genomic DNA was extracted from female mosquitoes. After amplification of partial fragments of the AnsteObp1, COI, COII and ITS2 genes, the PCR products were purified and sequenced. Phylogenetic analysis was performed in MEGA 7 after aligning the query sequences against sequences submitted to GenBank.

Results
The number of ridges on each egg float ranged from 12 to 13, which corresponds to the mysorensis form of An. stephensi. The generated COI, COII and ITS2 sequences showed 100%, 99.46% and 99.29% similarity with the sequences deposited for Chinese, Indian and Iranian strains of An. stephensi, respectively. All the generated AnsteObp1 intron I region sequences matched 100% with the sequences deposited for An. stephensi sibling species C (mysorensis form) from Iran and Afghanistan.

Conclusions
This manuscript describes the morphological and molecular details of the 'var mysorensis' form of An. stephensi, which could be exploited in elucidating its classification as well as in differentiating it from other biotypes of the same or other anopheline species. Based on our findings, we recommend AnsteObp1 as a robust genetic marker for rapid and accurate discrimination (taxonomic identification) of the members of the An. stephensi species complex, rather than the COI, COII and ITS2 markers, which could only be used for interspecies (Anopheles) differentiation.


Background
A wide variety of medically important insects belong to cryptic species complexes, whose members are morphologically identical (isomorphic) but reproductively isolated, and differ in seasonal prevalence, host preference, infection rates, resting habits, and biting cycles [1,2,3]. For instance, around 70 of the 482 species of anopheline mosquitoes act as vectors for malaria parasites, and nearly 30 complexes among them have been identified worldwide so far [4,5,6]. Because new biological species continue to be discovered, the number of Anopheles vector species is rising rapidly [4,5,7,8]. The available information on the biology and distribution of the An. gambiae, An. culicifacies, and An. dirus complexes in Africa, India, and Thailand, respectively, has demonstrated the importance of identifying the members of these species complexes [5]. Failure to discriminate between the vector and non-vector sibling species of anopheline species complexes may seriously mislead malaria epidemiological mapping and the subsequent vector control strategies [5]. Lack of adequate knowledge about vector species complexes is playing a significant role in the current worsening of human malaria in the Asia-Pacific area [9], with worldwide malaria cases rising from 217 million in 2016 to 219 million in 2017 and around 229 million in 2019.
An. stephensi is one of the dominant malaria vectors in the Middle East, the Indian subcontinent, Iran, Iraq, Bangladesh, south China, Myanmar, Thailand and Ethiopia [9,11,12]. Based on egg morphometric analysis, An. stephensi has three biological forms, i.e. mysorensis, intermediate and type [3], which were identified as sibling species C, B and A, respectively [3,5]. The type form is an efficient vector of malaria in urban areas [13], whereas mysorensis is a poor vector (highly zoophilic) and limited to rural areas, although it is susceptible to Plasmodium vivax (VK210B) [3,8,14]. The intermediate form is reported from rural and peri-urban areas, with very little information about its vectorial capacity [6,15]. Despite efficient malaria control strategies, the geographic range of An. stephensi is expanding. Thus, there is a dire need for precise identification of the members of the An. stephensi complex, and of the other Anopheles complexes, which is crucial for malaria surveillance, effective control, and elimination strategies [5,15].
Information regarding the population genetics of An. stephensi is still limited [16,17]. The mitochondrial cytochrome oxidase subunits 1 and 2 (COI and COII), ribosomal internal transcribed spacer 2 (rDNA-ITS2) and domain-3 (D3) loci are the commonly used molecular markers, but none of them has accurately distinguished the biological forms of An. stephensi [15,17,18]. Alternatively, the intron I sequence of odorant-binding protein 1 has recently been shown to be a potential genetic marker for differentiating members of the An. stephensi complex [15]. Accurate identification of the cryptic species of the An. stephensi complex, together with knowledge of their spatial distribution and population dynamics, has major human health implications since it directly affects vector control and disease management strategies [4,5].
Consequently, this study was designed with the following objectives: (i) to assess the potential of variation in the routinely used COI, COII and ITS2 markers for reconstructing the phylogeny and recognizing cryptic species of An. stephensi in our insectary; (ii) to demonstrate (as secondary evidence) that the AnsteObp1 intron I sequence is a robust marker for rapid and accurate identification of An. stephensi; and (iii) to introduce an optimized and easy protocol for sibling species identification of An. stephensi, based on the current molecular and morphological data.
It is indispensable to accurately characterize an insectary colony that could be used in vector control strategies such as Wolbachia-based approaches and gene drive. These developing technologies are becoming more popular and important for vector population replacement/suppression, but they are highly species-specific. The preliminary sequence data (associated with mysorensis) generated in this study may contribute to the knowledge and reliable identification (taxonomic and phylogenetic) of the mysorensis form of An. stephensi.

Colony maintenance
The colony of An. stephensi (Hor strain) has been maintained for over 6 years in 30 × 30 × 30 cm cages in the insectary at Sun Yat-sen University, Guangzhou, Guangdong Province, China. The colony was originally obtained courtesy of Wen-Yue Xu from the Department of Pathogenic Biology, Third Military Medical University, Chongqing, China [19]. The rearing conditions were 28 ± 2 ℃, 70 ± 5% RH, and a 12:12 (L:D) h photoperiod, with a 10% (w/v) sugar solution. Larvae were reared in plastic trays (30 × 40 × 8 cm) with deionized water and fed IAEA 2 larval food according to the standard procedure [20].

Mosquito feeding, collection and morphological study of eggs
Five to seven days after adult emergence, the female mosquitoes were allowed to feed on anesthetized white mice (Kunming strain) for 30 minutes to initiate egg development. After blood feeding, about 50 engorged females were randomly isolated and kept in individual, properly labeled plastic tubes (50 mL; one mosquito per tube) with a damp paper at the bottom of each tube for egg collection. The tubes were provided with cotton soaked in 10% sugar solution. After three days, the adult females were processed further for molecular analysis. About 50 eggs at a time were mounted on a slide with a drop of water and examined under a stereomicroscope at 40× magnification (bright-field illumination) to count the number of ridges on one side of each egg, as described previously [3,16].
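As an illustration of how the ridge counts gathered this way might be tabulated, the following is a minimal sketch, not the procedure used in this study. The function name `summarize_ridges`, the constant `MYSORENSIS_RANGE` and the example counts are hypothetical; the 10-14 range is the widest mysorensis range quoted later in this paper [Subbarao et al., 1987].

```python
from collections import Counter

# Widest mysorensis ridge range quoted in this paper (10-14 ridges per egg).
# Treating it as a hard cut-off is an assumption made for illustration.
MYSORENSIS_RANGE = (10, 14)


def summarize_ridges(counts, expected=MYSORENSIS_RANGE):
    """Summarize per-egg ridge counts and check them against an expected range."""
    if not counts:
        raise ValueError("no ridge counts supplied")
    low, high = expected
    return {
        "n": len(counts),
        "min": min(counts),
        "max": max(counts),
        "histogram": dict(Counter(counts)),
        "all_within_expected": all(low <= c <= high for c in counts),
    }


# Invented counts for demonstration only; these are not data from the study.
toy_counts = [12, 13, 12, 12, 13, 13, 12]
summary = summarize_ridges(toy_counts)
print(summary)
```

The check only flags counts that fall outside the supplied range; it does not by itself assign a biological form.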

DNA extraction and PCR
After egg laying, each female mosquito was processed individually for DNA extraction using a Dongsheng Biotech DNA extraction kit according to the manufacturer's instructions. Briefly, one mosquito was placed in a 1.5 mL tube containing around 500 µL STE buffer and a small steel ball, and homogenized (50 Hz for 30-60 seconds). Then 5 µL of this homogenate (for each sample) was mixed with 18 µL of DNA extraction solution in a PCR tube, mixed well and incubated for 2 minutes at room temperature. The samples were then incubated in a PCR thermal cycler at 95 ℃ for 10 minutes. Afterwards, 2 µL of neutralizing fluid was added to each tube, mixed well and incubated for several minutes at room temperature. Finally, the extracted DNA was either stored at -20 ℃ or used immediately for amplification of the target genes.

Amplification of COI, COII, ITS2, and AnsteObp1 fragments
PCR was performed for individual mosquitoes (n = 50) to amplify partial COI, COII, ITS2 and AnsteObp1 genes. The PCR reactions were carried out in a 25 μL volume; the primers and PCR conditions used for each marker are presented in Table 1. Double-distilled H2O was used instead of template DNA as the negative control. The PCR products were purified using the TaKaRa Agarose Gel DNA Extraction Kit (Japan), and the amplicons were subsequently sequenced in both directions using Sanger sequencing.

Sequence analysis and phylogenetic tree construction
The sequences were trimmed to remove any primer or other nucleotide contamination and double-checked with Chromas software version 2.31 (www.technelysium.com.au/chromas.html). The final sequences were aligned with the homologous sequences downloaded from GenBank using ClustalW [21], and phylogenetic trees were constructed using the distance-based neighbor-joining and maximum likelihood methods based on the Tamura-Nei model in MEGA7 [22].
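To make the idea of a distance-based analysis concrete, the sketch below computes pairwise distances from already-aligned sequences. It deliberately swaps in the simpler Jukes-Cantor correction rather than the Tamura-Nei model used in MEGA7, so it is an illustration of the principle and not a reproduction of the analysis. All function names and the toy sequences are hypothetical.

```python
import math
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') or an 'N' are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared


def jukes_cantor(p):
    """Jukes-Cantor distance d = -3/4 * ln(1 - 4p/3); infinite for p >= 0.75."""
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf
    return -0.75 * math.log(1 - 4 * p / 3)


def distance_matrix(aligned):
    """Map each unordered pair of sequence names to its Jukes-Cantor distance."""
    return {
        (i, j): jukes_cantor(p_distance(aligned[i], aligned[j]))
        for i, j in combinations(aligned, 2)
    }


# Toy aligned sequences invented for illustration; not data from this study.
toy = {"lab": "ACGTACGTAC", "ref1": "ACGTACGTAC", "ref2": "ACGTACGTAT"}
matrix = distance_matrix(toy)
for pair, dist in matrix.items():
    print(pair, round(dist, 4))
```

A matrix of this kind is the input that a neighbor-joining algorithm would cluster into a tree; the tree-building step itself is left to dedicated software.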
A 120 bp fragment of the AnsteObp1 intron I region, selected from the 845 bp sequenced region of each specimen, was used for analysis. The An. stephensi sibling species sequences A (KJ557463), B (KJ557452) and C (KJ557455) [15] were used as representatives for sequence comparison and phylogenetic tree construction.
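The comparison against the three reference sequences can be pictured as a nearest-reference lookup. The sketch below is a simplified stand-in that assumes the query and references are already aligned to the same length; the function names are hypothetical, and the 12-base strings are invented placeholders, not the KJ557463, KJ557452 or KJ557455 sequences.

```python
def percent_identity(query, reference):
    """Percentage of identical positions between two aligned, equal-length sequences."""
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    if not query:
        raise ValueError("empty sequence")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(query)


def closest_reference(query, references):
    """Return the best-matching reference name and the identity to every reference."""
    scores = {name: percent_identity(query, seq) for name, seq in references.items()}
    best = max(scores, key=scores.get)
    return best, scores


# Placeholder fragments, invented for illustration only.
toy_refs = {
    "A": "ACGTTGCAAGCT",
    "B": "ACGTAGCAAGCT",
    "C": "ACCTAGCATGCT",
}
query = "ACCTAGCATGCT"
best, scores = closest_reference(query, toy_refs)
print(best, scores)
```

With real data, the references would be the corresponding 120 bp intron I fragments, and a tie or a low best score would call for inspecting the alignment rather than trusting the label.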

Morphological analysis of the eggs
A total of 500 eggs were examined. All the observed eggs were uniform, with 12-13 ridges per egg (Fig. 1). Based on the previously reported range of egg ridges, our laboratory mosquito colony was identified as the mysorensis biological form of An. stephensi.
The previously defined criteria for identifying these three biological forms on the basis of ridges ... GenBank; of these, seven sequences (from Iran, India, China and Brazil) were compatible with our lab strain and were included in the sequence analysis (Fig. 2 and supplementary fig 1). Multiple sequence alignment showed that the similarity between the lab strain COI sequence and the GenBank sequences was 99.87-100%. There were 7 mismatches in the COI sequences: one transversion and 6 transitions (Fig. 2 and supplementary fig 1). Notably, a sequence directly submitted from China [24] was 100% similar to our lab strain COI sequence (Fig. 2). The COI sequences of An. stephensi from Iran, India, Brazil and China were distributed across three different clades in the phylogenetic tree (supplementary fig 1), possibly because the sequences were from An. stephensi sensu lato. Our lab strain sequence was placed in the same clade as the Chinese An. stephensi sequence.
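For readers unfamiliar with the distinction, a transition swaps a purine for a purine (A/G) or a pyrimidine for a pyrimidine (C/T), while a transversion swaps a purine for a pyrimidine or vice versa. The following minimal sketch shows how mismatches in a pairwise alignment could be tallied this way; the function name and the toy sequences are hypothetical and are not data from this study.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_mismatches(seq_a, seq_b):
    """Count transitions and transversions and compute percent identity.

    Both sequences must be aligned to the same length. Columns containing
    anything other than A, C, G or T (gaps, ambiguity codes) are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # Same chemical class (purine/purine or pyrimidine/pyrimidine) = transition.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    identity = 100.0 * (compared - transitions - transversions) / compared
    return transitions, transversions, identity


# Toy alignment invented for illustration: one A/G transition, one C/A transversion.
result = classify_mismatches("ACGTACGTAC", "GCGTACGTAA")
print(result)
```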

Phylogenetic analysis of mitochondrial oxidase subunit II (COII)
The COII sequences of An. stephensi (n = 24) extracted from GenBank were compared with our lab strain sequences (n = 9). Four sequences were excluded because of their shorter length. The similarities between our sequences and the 20 An. stephensi COII sequences available in GenBank, deposited from different countries, were 98.75-99.82% (Fig. 3 and supplementary fig 2). The less than 2% variation was due to 13 mismatches: transitions (n = 11) and transversions (n = 2) (Fig. 3 and supplementary fig 2). Notably, the mismatches at positions 195, 454 and 547 were specific to the lab strain (supplementary fig 2). The phylogenetic tree constructed from the An. stephensi COII sequences grouped the sequences into 3 clades (Fig. 3). The lab strain sequence was placed in a separate clade together with An. stephensi COII sequences from India and Iran (Fig. 3).
These sequences were previously submitted from Iran, India, Iraq, Saudi Arabia and Sri Lanka. Eleven sequences from Sri Lanka were partial and were therefore excluded from the study. Finally, 37 representative rDNA-ITS2 sequences from GenBank and the sequences obtained in the current study (n = 7; MW017363, MW017364 and MZ269267-MZ269271; 470 bp) were used for analysis and phylogenetic tree construction (Fig. 4 and supplementary fig 3). Our new lab strain sequences showed 98.71% similarity with each other, although they were randomly selected from the same colony. BLAST analysis of the obtained sequences showed 97.63-100% similarity with sequences reported from Iran, India, Iraq and Saudi Arabia (Fig. 4).
Interestingly, a sequence from India, HQ703001, showed 82.19-82.97% similarity with other rDNA-ITS2 sequences of An. stephensi, while its similarity with AY702482 from Iran was 98.28-99.79% (Fig. 4 and supplementary fig 3). The similarity between the lab strain sequences and the others was 97.63-99.57% (Fig. 4). The topology of the phylogenetic tree based on rDNA-ITS2 sequences of An. stephensi was similar to that of the COI and COII trees, with 3 clades and lower bootstrap values for the clades (supplementary fig 3).
The AnsteObp1 intron I sequences obtained in this study clustered in a separate clade together with An. stephensi sibling species C (mysorensis) in the phylogenetic tree (Fig. 6). As shown previously [15,16], An. stephensi sibling species A and B were placed in separate clades (Fig. 6).
AnsteObp1 was recently reported as a new marker for identification of the main Asian malaria vector, An. stephensi. In the current study, we assessed the effectiveness of commonly used (COI, COII, and ITS2) and novel (AnsteObp1) markers, as well as morphological features (egg ridge counts), in identifying the biological form of An. stephensi. The extensive morphological analysis of our mosquito eggs showed 12-13 ridges per egg, which corresponds to the mysorensis form of An. stephensi [3,23]. Our results are in accordance with the previously reported ranges of ridge number for mysorensis, i.e. 11 to 14 [Nagpal et al., 2003], 13-14, and 10-14 [Subbarao et al., 1987]. Similarly, the phylogenetic analysis (Fig. 6) of the AnsteObp1 sequences of our mosquito strain showed 100% similarity with sibling species C reported from Iran and Afghanistan (a country neighboring China). The current study has a limitation in that we only used the available lab strain of An. stephensi (wild mosquitoes were not available because of strict vector control measures in China). However, in a previous study [16], the entire wild collection of three strains of An. stephensi was successfully discriminated using AnsteObp1. This discrimination was attributed to form-specific/associated mutations (4-15%) within the intron region of AnsteObp1 between biological forms, whereas no significant variation was noticed within the biological forms [16]. There was 99-100% similarity in the amino acid sequences of AnsteObp1 among these members, with a single non-synonymous substitution in the type form [16]. Taken together, our mysorensis-associated preliminary sequence data may be used as representative/reference sequences for sequence comparison and phylogenetic tree construction in similar studies in the future.
Finally, our investigations, based on egg morphology and sequence analysis, endorse the independent use of the AnsteObp1 intron I sequence as a new molecular tool for quick and reliable identification of all three biological forms of An. stephensi.
Vector control is fundamental for preventing the spread of malaria, and understanding the population genetic structure of the mosquito is imperative for shaping prevention strategies. An earlier study of mysorensis and type strains reported a definite incompatibility [28]. Others demonstrated variations in reproductive capacity within these biological forms of An. stephensi [34]. In contrast, no hybrid sterility was reported in type-mysorensis cross experiments [27]. Subbarao et al. (1987) did not find sterility in crossing experiments between laboratory strains of the three ecological/biological forms [3]. These experiments therefore provide perplexing results and should be repeated after precise recognition of the members of this species complex using an effective genetic marker, such as AnsteObp1, and following the techniques outlined here (Table 1). Further, there is a dire need to determine whether premating (reproductive isolation) barriers exist among these forms in the field. Accurate identification, and exploration of the mating compatibility/incompatibility of wild populations with the released (lab) strain, is essential for the successful field application of emerging technologies such as the Wolbachia-based approach, which has been implemented for the suppression/replacement of wild Aedes mosquito populations in dozens of countries to control dengue and Zika viruses [35]. If the wild and lab (Wolbachia-infected) mosquito populations are compatible, this approach could be used to eradicate malaria vectors in the wild.
Regarding the suitability of ITS2, COI and COII for distinguishing the biotypes of An. stephensi, our current observations and previous studies [17,18] confirm that these markers are not suitable, given the high sequence similarity among forms. In contrast, a recent study reports both COI and COII (gene variation) as suitable markers for recognizing the complexes of An. gambiae and An. albitarsis [36]. Yet our BLAST searches of the GenBank database with ITS2, COI and COII sequences once again showed them to be inappropriate for distinguishing our strain. Our results are in accordance with [13,18,37]. Here, the phylogenetic analysis of COI, COII and ITS2 indicated that our strain is 100%, 99.46% and 99.29% similar to other Chinese, Indian and Iranian strains of An. stephensi (Figs. 2, 3 and 4). This indicates that the aforesaid markers can be recommended only for identifying An. stephensi at the species level (interspecies differentiation among Anopheles mosquitoes). Thus, the independent use of AnsteObp1 and the associated protocols described in this study are recommended for future investigations that involve distinguishing the members of the An. stephensi complex (intra-species variation).

Conclusion
This study finds AnsteObp1 as a robust genetic marker for the identification of members of the